Driving with Style: Inverse Reinforcement Learning in General-Purpose Planning for Automated Driving
Behavior and motion planning play an important role in automated driving.
Traditionally, behavior planners instruct local motion planners with predefined
behaviors. Due to the high scene complexity in urban environments,
unpredictable situations may occur in which behavior planners fail to match
predefined behavior templates. Recently, general-purpose planners have been
introduced, combining behavior and local motion planning. These general-purpose
planners allow behavior-aware motion planning given a single reward function.
However, two challenges arise: First, this function has to map a complex
feature space into rewards. Second, the reward function has to be manually
tuned by an expert, which quickly becomes a tedious task. In this paper, we
propose an approach that relies on human driving
demonstrations to automatically tune reward functions. This study offers
important insights into the driving style optimization of general-purpose
planners with maximum entropy inverse reinforcement learning. We evaluate our
approach based on the expected value difference between learned and
demonstrated policies. Furthermore, we compare the similarity of human-driven
trajectories with optimal policies of our planner under learned and
expert-tuned reward functions. Our experiments show that we are able to learn
reward functions exceeding the level of manual expert tuning without prior
domain knowledge.
Comment: Appeared at IROS 2019. Accepted version. Added/updated footnote,
minor correction in preliminaries.
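The abstract does not spell out the learning rule, but maximum entropy IRL (Ziebart et al., 2008) fits a linear reward so that the soft-optimal policy reproduces the demonstrations' feature expectations. Below is a minimal tabular sketch of that update on a hypothetical toy MDP, not the paper's sampling-based planner; all function names, dimensions, and hyperparameters are illustrative assumptions:

```python
import numpy as np

def soft_policy(P, reward, gamma=0.95, iters=100):
    """Soft (maximum-entropy) value iteration on a tabular MDP.
    P: transitions of shape (A, S, S), P[a, s, t] = Pr(t | s, a).
    reward: per-state reward vector of shape (S,)."""
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = reward[None, :] + gamma * np.einsum('ast,t->as', P, V)  # (A, S)
        m = Q.max(axis=0)
        V = m + np.log(np.exp(Q - m).sum(axis=0))  # soft-max over actions
    return np.exp(Q - V[None, :]).T  # stochastic policy, shape (S, A)

def expected_svf(P, pi, demos, T):
    """Expected state visitation frequencies over horizon T, started from
    the empirical initial-state distribution of the demonstrations."""
    A, S, _ = P.shape
    d = np.zeros(S)
    for traj in demos:
        d[traj[0]] += 1.0 / len(demos)
    svf = d.copy()
    for _ in range(T - 1):
        d = np.einsum('s,sa,ast->t', d, pi, P)
        svf += d
    return svf

def maxent_irl(features, P, demos, lr=0.05, epochs=200, gamma=0.95, T=20):
    """Learn linear reward weights theta so the soft-optimal policy's
    feature expectations match those of the demonstrations."""
    S, F = features.shape
    theta = np.zeros(F)
    mu_demo = np.mean([features[traj].sum(axis=0) for traj in demos], axis=0)
    for _ in range(epochs):
        pi = soft_policy(P, features @ theta, gamma)
        mu_policy = features.T @ expected_svf(P, pi, demos, T)
        theta += lr * (mu_demo - mu_policy)  # gradient of demo log-likelihood
    return theta

# toy usage: 2 actions, 4 states, one-hot state features, a single demo
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(4), size=(2, 4))  # transition tensor (A, S, S)
features = np.eye(4)
demos = [[0, 1, 2, 3]]
theta = maxent_irl(features, P, demos, T=4)
```

The gradient step is the key idea the paper builds on: when the planner's expected feature counts match the demonstrated ones, the learned reward reproduces the demonstrated driving style without manual tuning.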
Driving Style Encoder: Situational Reward Adaptation for General-Purpose Planning in Automated Driving
General-purpose planning algorithms for automated driving combine mission,
behavior, and local motion planning. Such planning algorithms map features of
the environment and driving kinematics into complex reward functions. To
achieve this, planning experts often rely on linear reward functions. The
specification and tuning of these reward functions is a tedious process and
requires significant experience. Moreover, a manually designed linear reward
function does not generalize across different driving situations. In this work,
we propose a deep learning approach based on inverse reinforcement learning
that generates situation-dependent reward functions. Our neural network
provides a mapping between features and actions of sampled driving policies of
a model-predictive control-based planner and predicts reward functions for
upcoming planning cycles. In our evaluation, we compare the driving style of
reward functions predicted by our deep network against clustered and linear
reward functions. Our proposed deep learning approach outperforms clustered
linear reward functions and is on par with linear reward functions that have
a priori knowledge about the situation.
Comment: To appear in Proceedings of the IEEE International Conference on
Robotics and Automation (ICRA), Paris, France, June 2020 (Virtual
Conference). Accepted version. Corrected figure font.
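The abstract gives only the network's interface: features and actions of the MPC planner's sampled policies go in, a reward function for the next planning cycle comes out. The sketch below is a hypothetical minimal PyTorch version of such a mapping; the class name DrivingStyleEncoder, the mean-pooling over sampled trajectories, and all layer sizes are assumptions, not the paper's architecture:

```python
import torch
import torch.nn as nn

class DrivingStyleEncoder(nn.Module):
    """Hypothetical sketch: encode features and actions of the planner's
    sampled candidate trajectories and predict linear reward weights
    for the upcoming planning cycle."""

    def __init__(self, n_features: int, n_actions: int, hidden: int = 64):
        super().__init__()
        # shared per-trajectory encoder over concatenated features and actions
        self.encoder = nn.Sequential(
            nn.Linear(n_features + n_actions, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # head mapping the pooled situation embedding to reward weights
        self.head = nn.Linear(hidden, n_features)

    def forward(self, traj_features: torch.Tensor,
                traj_actions: torch.Tensor) -> torch.Tensor:
        # traj_features: (n_traj, n_features), traj_actions: (n_traj, n_actions)
        x = torch.cat([traj_features, traj_actions], dim=-1)
        # mean-pool over sampled trajectories: a permutation-invariant summary
        # of the current driving situation (an assumed pooling scheme)
        h = self.encoder(x).mean(dim=0)
        return self.head(h)  # situation-dependent linear reward weights

# usage sketch with made-up dimensions: score each sampled trajectory
# with the predicted weights, as a linear reward function would
model = DrivingStyleEncoder(n_features=12, n_actions=2)
feats = torch.randn(200, 12)  # integral path features of 200 samples
acts = torch.randn(200, 2)    # e.g. mean acceleration and steering
weights = model(feats, acts)  # predicted reward weights, shape (12,)
rewards = feats @ weights     # one scalar reward per sampled trajectory
```

Because the predicted weights still enter a linear reward over trajectory features, the planner's downstream optimization is unchanged; only the weights are adapted per situation, which is what lets this approach match situation-specific hand-tuned linear rewards.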